Santa Barbara Channel

Enhancing kelp forest detection in remote sensing images using crowdsourced labels with Mixed Vision Transformers and ConvNeXt segmentation models

Nasios, Ioannis

arXiv.org Artificial Intelligence

Kelp forests, as foundation species, are vital to marine ecosystems, providing essential food and habitat for numerous organisms. This study explores the integration of crowdsourced labels with advanced artificial intelligence models to develop a fast and accurate kelp canopy detection pipeline using Landsat images. Building on a machine learning competition, where this approach ranked third and performed consistently well on local validation as well as the public and private leaderboards, the research highlights the effectiveness of combining Mixed Vision Transformers (MIT) with ConvNeXt models. Training these models on various image sizes significantly enhanced the accuracy of the ensemble results. U-Net emerged as the best segmentation architecture, with UpperNet also contributing to the final ensemble. Key Landsat bands, such as shortwave infrared (SWIR1) and near-infrared (NIR), proved crucial, while altitude data was used in postprocessing to eliminate false positives on land. The methodology achieved a high detection rate, accurately identifying about three out of four kelp-canopy pixels while keeping false positives low. Despite the medium resolution of Landsat satellites, their extensive historical coverage makes them effective for studying kelp forests. This work also underscores the potential of combining machine learning models with crowdsourced data for effective and scalable environmental monitoring. All code for model training and inference is available at https://github.com/IoannisNasios/Kelp_Forests.
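The final ensemble-and-postprocess step described in the abstract can be sketched as below. This is a minimal illustration, not the competition code: the function name, the plain mean over model outputs, and the 0.5 threshold are assumptions for the sketch.

```python
import numpy as np

def ensemble_kelp_mask(prob_maps, altitude, threshold=0.5):
    """Average per-model kelp probabilities, then drop land pixels.

    prob_maps: list of (H, W) arrays in [0, 1], one per model/image size.
    altitude:  (H, W) elevation in metres; pixels above sea level are land.
    """
    mean_prob = np.mean(prob_maps, axis=0)   # simple ensemble average
    mask = mean_prob >= threshold            # binary kelp-canopy mask
    mask &= altitude <= 0                    # false positives on land removed
    return mask.astype(np.uint8)
```

Masking by altitude is cheap and model-agnostic, which is why it fits naturally in postprocessing rather than in the networks themselves.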


GRAM: Global Reasoning for Multi-Page VQA

Blau, Tsachi, Fogel, Sharon, Ronen, Roi, Golts, Alona, Ganz, Roy, Avraham, Elad Ben, Aberdam, Aviad, Tsiper, Shahar, Litman, Ron

arXiv.org Artificial Intelligence

The increasing use of transformer-based large language models brings forward the challenge of processing long sequences. In document visual question answering (DocVQA), leading methods focus on the single-page setting, while documents can span hundreds of pages. We present GRAM, a method that seamlessly extends pre-trained single-page models to the multi-page setting without requiring computationally heavy pretraining. To do so, we leverage a single-page encoder for local page-level understanding and enhance it with designated document-level layers and learnable tokens, facilitating the flow of information across pages for global reasoning. To ensure the model utilizes the newly introduced document-level tokens, we propose a tailored bias adaptation method. For additional computational savings during decoding, we introduce an optional compression stage using our C-Former model, which reduces the encoded sequence length and thereby allows a trade-off between quality and latency. Extensive experiments showcase GRAM's state-of-the-art performance on multi-page DocVQA benchmarks, demonstrating the effectiveness of our approach.
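The core idea of a document-level layer over per-page features plus learnable tokens can be illustrated with a toy single-head attention step. Everything here is an assumption for illustration only: the function name, shapes, and the bare single-head attention do not reflect GRAM's actual architecture, bias adaptation, or training.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_reasoning_layer(page_feats, doc_tokens, W_q, W_k, W_v):
    """One toy document-level attention step.

    page_feats: list of (T, D) arrays, one per page (from a single-page encoder).
    doc_tokens: (M, D) learnable document-level tokens.
    Concatenating them lets attention route information across pages.
    """
    seq = np.concatenate([doc_tokens] + page_feats, axis=0)   # (M + P*T, D)
    q, k, v = seq @ W_q, seq @ W_k, seq @ W_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))            # full attention
    return attn @ v
```

The point of the sketch is only that, once page features and document tokens share one sequence, a standard attention layer suffices to mix information globally.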


PETAL: Physics Emulation Through Averaged Linearizations for Solving Inverse Problems

Jin, Jihui, Ollivier, Etienne, Touret, Richard, McKinley, Matthew, Sabra, Karim G., Romberg, Justin K.

arXiv.org Artificial Intelligence

Inverse problems describe the task of recovering an underlying signal of interest from observables. Typically, the observables are related to the underlying unknown signal through a non-linear forward model. Inverting the non-linear forward model can be computationally expensive, as it often involves computing and inverting a linearization at a series of estimates. Rather than inverting the physics-based model, we instead train a surrogate forward model (emulator) and leverage modern auto-grad libraries to solve for the input within a classical optimization framework. Current emulators are trained in a black-box supervised machine learning fashion and fail to take advantage of existing knowledge of the forward model. In this article, we propose a simple learned weighted-average model that embeds linearizations of the forward model around various reference points into the model itself, explicitly incorporating known physics. Grounding the learned model with physics-based linearizations improves forward-modeling accuracy and provides richer physics-based gradient information during the inversion process, leading to more accurate signal recovery. We demonstrate the efficacy on an ocean acoustic tomography (OAT) example that aims to recover ocean sound speed profile (SSP) variations from acoustic observations.
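The weighted average of linearizations can be written as f̂(x) = Σᵢ wᵢ(x) [f(xᵢ) + Jᵢ(x − xᵢ)]. The sketch below uses a distance-based softmax as a stand-in for the weights; in PETAL the weights come from a learned network, so the weighting scheme, function name, and `temp` parameter here are illustrative assumptions.

```python
import numpy as np

def petal_style_emulator(x, refs, f_refs, jacobians, temp=1.0):
    """Weighted average of first-order expansions around reference points.

    x:         (D,) query input
    refs:      (N, D) reference points x_i
    f_refs:    (N, K) forward-model outputs f(x_i)
    jacobians: (N, K, D) linearizations J_i of f at each x_i
    """
    d2 = ((refs - x) ** 2).sum(axis=1)        # squared distance to each x_i
    w = np.exp(-d2 / temp)
    w /= w.sum()                              # convex weights (learned in PETAL)
    # first-order expansion of f around each reference point
    expansions = f_refs + np.einsum('nkd,nd->nk', jacobians, x - refs)
    return (w[:, None] * expansions).sum(axis=0)
```

A useful sanity check: when the true forward model is linear, every expansion is exact, so any convex weighting reproduces f(x) exactly.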


Perceptron: AI saving whales, steadying gaits and banishing traffic

#artificialintelligence

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers -- particularly in, but not limited to, artificial intelligence -- and explain why they matter. Over the past few weeks, researchers at MIT have detailed their work on a system to track the progression of Parkinson's patients by continuously monitoring their gait speed. Elsewhere, Whale Safe, a project spearheaded by the Benioff Ocean Science Laboratory and partners, launched buoys equipped with AI-powered sensors in an experiment to prevent ships from striking whales. Other aspects of ecology and academics also saw advances powered by machine learning.


How Artificial Intelligence is being used to save whales

#artificialintelligence

Smartphones, like many consumer products, arrive in the US on giant container ships, vessels that are leading killers of endangered whales that play crucial roles in the climate and ocean health. Now a high-tech initiative called Whale Safe is detecting the huge marine mammals off the coast of San Francisco and alerting ship captains to slow down to avoid deadly collisions. Launched on Wednesday, Whale Safe aims to create "school zones" for imperiled blue whales, fin whales and humpback whales in busy shipping lanes, according to the project's managers at the Benioff Ocean Science Laboratory at the University of California at Santa Barbara and at the Bay Area's Marine Mammal Center. Speeders are caught by satellite surveillance and cited online. That gives consumers the opportunity to see, for instance, if that cruise they're contemplating is operated by a company with a history of ignoring sea speed limits.


Deep Science: Robot perception, acoustic monitoring, using ML to detect arthritis – TechCrunch

#artificialintelligence

Research papers come out far too rapidly for anyone to read them all, especially in the field of machine learning, which now affects (and produces papers in) practically every industry and company. This column aims to collect the most relevant recent discoveries and papers -- particularly in but not limited to artificial intelligence -- and explain why they matter. The topics in this week's Deep Science column are a real grab bag, ranging from planetary science to whale tracking. There are also some interesting insights from tracking how social media is used and some work that attempts to shift computer vision systems closer to human perception (good luck with that). One of machine learning's most reliable use cases is training a model on a target pattern, say a particular shape or radio signal, and setting it loose on a huge body of noisy data to find possible hits that humans might struggle to perceive.


Training a U-Net based on a random mode-coupling matrix model to recover acoustic interference striations

Li, Xiaolei, Song, Wenhua, Gao, Dazhi, Gao, Wei, Wan, Haozhong

arXiv.org Machine Learning

A U-Net is trained to recover acoustic interference striations (AISs) from distorted ones. A random mode-coupling matrix model is introduced to quickly generate a large number of training data, which are used to train the U-Net. The performance of AIS recovery by the U-Net is tested in range-dependent waveguides with nonlinear internal waves (NLIWs). Although the random mode-coupling matrix model is not an accurate physical model, the test results show that the U-Net successfully recovers AISs under different signal-to-noise ratios (SNRs) and for NLIWs of different amplitudes, widths, and shapes.
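The data-generation idea can be sketched crudely: two propagating modes interfere to produce a striation pattern in range and frequency, and a random near-identity coupling matrix perturbs the mode amplitudes to produce the distorted counterpart. This is a two-mode toy, not the paper's model; the function name, the linear frequency scaling of wavenumbers, and the coupling strength are assumptions.

```python
import numpy as np

def striation_pair(k0, dk, ranges, freqs, rng, coupling_strength=0.3):
    """Generate a clean and a distorted interference-striation image.

    Returns two (F, R) intensity arrays: clean modal interference, and the
    same field after a random coupling matrix mixes the mode amplitudes.
    """
    amps = np.ones(2, dtype=complex)                      # uncoupled amplitudes
    C = np.eye(2) + coupling_strength * rng.standard_normal((2, 2))
    clean, distorted = [], []
    for f in freqs:
        k = np.array([k0, k0 + dk]) * f                   # mode wavenumbers at f
        phases = np.exp(1j * np.outer(k, ranges))         # (2, R) propagators
        clean.append(np.abs(amps @ phases) ** 2)
        distorted.append(np.abs((C @ amps) @ phases) ** 2)
    return np.array(clean), np.array(distorted)
```

Because drawing a new random matrix is the only per-sample cost, such pairs can be produced far faster than running a full range-dependent propagation model, which is the appeal of the approach.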